feat: shutdown race resilience #141
Conversation
For some reason, I need to set up the loadConfig tests with:

	tests := map[string]struct {
		setEnv   func(*testing.T)
		expected config
	}{
		"defaults": {
	+		setEnv: func(t *testing.T) {
	+			t.Helper()
	+			t.Setenv("RYUK_VERBOSE", "false")
	+		},
			expected: config{
				ConnectionTimeout:   time.Minute,
				Port:                8080,
				ReconnectionTimeout: time.Second * 10,
				RemoveRetries:       10,
				RequestTimeout:      time.Second * 10,
				RetryOffset:         -time.Second,
				ShutdownTimeout:     time.Minute * 10,
			},
		},
		"custom": {
			setEnv: func(t *testing.T) {
				t.Helper()
				t.Setenv("RYUK_PORT", "1234")
				t.Setenv("RYUK_CONNECTION_TIMEOUT", "2s")
				t.Setenv("RYUK_RECONNECTION_TIMEOUT", "3s")
				t.Setenv("RYUK_REQUEST_TIMEOUT", "4s")
				t.Setenv("RYUK_REMOVE_RETRIES", "5")
				t.Setenv("RYUK_RETRY_OFFSET", "-6s")
				t.Setenv("RYUK_SHUTDOWN_TIMEOUT", "7s")
	+			t.Setenv("RYUK_VERBOSE", "false")
			},
			expected: config{
				Port:                1234,
				ConnectionTimeout:   time.Second * 2,
				ReconnectionTimeout: time.Second * 3,
				RequestTimeout:      time.Second * 4,
				RemoveRetries:       5,
				RetryOffset:         -time.Second * 6,
				ShutdownTimeout:     time.Second * 7,
			},
		},
	}

A bug in the env library for default booleans?
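The env library isn't named in this thread; the fragment below is a minimal illustrative sketch assuming a caarlos0/env-style struct-tag parser, showing one way the symptom can arise without any library bug: if the boolean carried a default of "true", a test expecting the zero value would only pass when RYUK_VERBOSE is explicitly set to "false", matching the workaround above.

	package main

	import (
		"fmt"

		"github.com/caarlos0/env/v11"
	)

	// Illustrative fragment only, not ryuk's actual config: with an
	// envDefault of "true", parsing under an empty environment yields
	// Verbose == true, so a test expecting the zero value (false) needs
	// RYUK_VERBOSE=false set explicitly.
	type config struct {
		Verbose bool `env:"RYUK_VERBOSE" envDefault:"true"`
	}

	func main() {
		var cfg config
		if err := env.Parse(&cfg); err != nil {
			panic(err)
		}
		fmt.Printf("verbose: %t\n", cfg.Verbose) // true unless RYUK_VERBOSE overrides it
	}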
Thanks for opening this @stevenh. I see a lot of changes here, which I'd suggest splitting into separate PRs:
- config refactor
- run function
Ryuk is a critical piece of the Testcontainers ecosystem, so updating it should be done very carefully so as not to affect all the related projects.
Would you mind separating this big PR into multiple ones? I can help you out with it if needed.
Thanks!
Interesting, what's the error you get?
If you had environment variables set, the test would have failed, as it was expecting a clean environment. I've updated the test to clean the current environment before it starts, which could fix the issue you saw.
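The updated test isn't quoted in the thread; the sketch below shows one way such a scrub can be done (the helper name and the RYUK_ prefix filter are assumptions, not the PR's actual code):

	package reaper_test

	import (
		"os"
		"strings"
		"testing"
	)

	// clearEnv is a hypothetical helper: it unsets every RYUK_-prefixed
	// variable for the duration of the test and restores the original
	// values afterwards via t.Cleanup, so ambient settings can't leak
	// into the loadConfig assertions.
	func clearEnv(t *testing.T) {
		t.Helper()
		for _, kv := range os.Environ() {
			name, value, _ := strings.Cut(kv, "=")
			if !strings.HasPrefix(name, "RYUK_") {
				continue
			}
			os.Unsetenv(name)
			t.Cleanup(func() { os.Setenv(name, value) })
		}
	}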
Hi @stevenh, thanks for this PR. I've started the review and I see that there are some changes that would each deserve their own PR.
Could you separate them into three? I can help out with the first two so you can focus on the real meat here.
OK, so it's a little more involved: the Go bump needed a new linter, so we need to merge in the following order:
I didn't consider other env config libraries, as I've used that one a bunch of times and it has worked well, but if there is something better I'm happy to swap.
Given #159 has been merged, could you please rebase this one? 🙏
LGTM, thanks!
I like the vocabulary changes: I'll miss the ryuk names, although I admit that the naming in this PR is more inclusive and more maintainable for the long term, so thanks for this.
Once merged, as we are building the images for the commit, we can immediately test this image in testcontainers-go, using the commit SHA as the Docker tag (see https://hub.docker.com/r/testcontainers/ryuk/tags).
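For example, assuming the ryuk.container.image property that testcontainers-go reads from ~/.testcontainers.properties (the tag below is a placeholder, not a real commit):

	ryuk.container.image=testcontainers/ryuk:<commit-sha>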
@mdelapenya added some Docker CLI based tests for container, network, image and volume reaping, so it should be good to go.
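Those tests aren't quoted here, but as a rough sketch of the Docker CLI approach for the network case (the helper name and label are illustrative assumptions, not the PR's actual code):

	package reaper_test

	import (
		"os/exec"
		"testing"
	)

	// createTestNetwork is a hypothetical fixture: it shells out to the
	// docker binary to create a labelled network for the reaper to find,
	// and registers a best-effort cleanup in case the test fails before
	// reaping removes it.
	func createTestNetwork(t *testing.T, name string) {
		t.Helper()

		out, err := exec.Command("docker", "network", "create",
			"--label", "org.testcontainers.ryuk.test=true", // illustrative label, not the PR's actual filter
			name,
		).CombinedOutput()
		if err != nil {
			t.Fatalf("docker network create: %v: %s", err, out)
		}

		t.Cleanup(func() {
			// Best effort: the network should already be gone if reaping worked.
			_ = exec.Command("docker", "network", "rm", name).Run()
		})
	}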
One comment regarding shelling out to the docker binary.
Other than that, I'll take a final look this weekend for final approval 🙏 I'm doing conferences next week, so I'll try to do it before that.
A significant rewrite to ensure that we don't suffer from shutdown race conditions when the prune condition is met while additional resources are still being created. Previously this would remove resources that were still in use; now we retry if we detect that new resources have been created within a window of the prune condition triggering.

This supports the following new environment configuration settings (a sketch of the retry-window check follows the list):
- RYUK_REMOVE_RETRIES - the number of times to retry removing a resource.
- RYUK_REQUEST_TIMEOUT - the timeout for any Docker requests.
- RYUK_RETRY_OFFSET - the offset added to the start time of the prune pass that is used as the minimum resource creation time.
- RYUK_SHUTDOWN_TIMEOUT - the duration after shutdown has been requested, after which the remaining connections are ignored and prune checks start.

Update the README to correct the example, as health is only valid for containers, not the other resources, so it would cause failures.

Enable race detection on CI tests.
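As a minimal sketch of the retry window described above (the function name and shape are illustrative, not the PR's actual code): the prune pass records its start time, offsets it by RYUK_RETRY_OFFSET (negative, -1s by default per the test table earlier), and any resource created at or after that cutoff causes the pass to be retried rather than removing resources still coming into use.

	package main

	import (
		"fmt"
		"time"
	)

	// shouldRetry is a hypothetical helper: it reports whether any
	// resource was created at or after the prune cutoff, in which case
	// the prune pass is rerun instead of removing freshly created
	// resources out from under their owners.
	func shouldRetry(pruneStart time.Time, retryOffset time.Duration, created []time.Time) bool {
		// retryOffset is expected to be negative, moving the cutoff
		// slightly before the pass started to absorb clock skew.
		cutoff := pruneStart.Add(retryOffset)
		for _, c := range created {
			if !c.Before(cutoff) {
				return true
			}
		}
		return false
	}

	func main() {
		now := time.Now()
		created := []time.Time{now.Add(-2 * time.Second), now.Add(500 * time.Millisecond)}
		fmt.Println(shouldRetry(now, -time.Second, created)) // true: one resource is inside the window
	}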
LGTM, great PR, thanks for adding clarity to the reap process.
Cheers!